Uniquely decodable n-gram embeddings
Similar resources
Uniquely decodable n-gram embeddings
We define the family of n-gram embeddings from strings over a finite alphabet into a semimodule over $\mathbb{N}$. We classify all elements of this semimodule that are valid images of strings under such embeddings, as well as all elements whose inverse image consists of exactly one string (we call such elements uniquely decodable). We prove that for a fixed alphabet, the set of all strings whose image is uniquely decodable is a regular language. © 20...
Efficient, Compositional, Order-sensitive n-gram Embeddings
We propose ECO: a new way to generate embeddings for phrases that is Efficient, Compositional, and Order-sensitive. Our method creates decompositional embeddings for words offline and combines them to create new embeddings for phrases in real time. Unlike other approaches, ECO can create embeddings for phrases not seen during training. We evaluate ECO on supervised and unsupervised tasks and de...
On the Characterization of Linear Uniquely Decodable Codes
A Uniquely Decodable (UD) Code is a code such that any vector of the ambient space has a unique closest codeword. In this paper we begin a study of the structure of UD codes and identify perfect subcodes. In particular we determine all linear UD codes of covering radius ≤ 2.
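The definition quoted in this abstract can be checked by brute force on small examples. The sketch below assumes binary vectors and Hamming distance, neither of which is stated in the excerpt above; the function names are illustrative.

```python
from itertools import product


def hamming(u, v):
    """Number of coordinates in which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def is_ud_code(code, length):
    """True if every binary vector of the given length has a unique closest codeword."""
    for v in product((0, 1), repeat=length):
        dists = sorted(hamming(v, c) for c in code)
        # A tie for the minimum distance means v has no unique closest codeword.
        if len(dists) > 1 and dists[0] == dists[1]:
            return False
    return True


# The length-3 repetition code is UD; the length-2 repetition code is not,
# because (0, 1) is at distance 1 from both codewords.
print(is_ud_code([(0, 0, 0), (1, 1, 1)], 3))
print(is_ud_code([(0, 0), (1, 1)], 2))
```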
Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features
The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question of whether similar methods could be derived to improve embeddings (i.e. semantic representations) of word sequences as well. We present a simple but efficient unsupervised objective to train distributed representations of sentences. Our method outperforms the state-of-the-art unsuper...
Set of uniquely decodable codes for overloaded synchronous CDMA
In this paper, we consider the design of a new set of Uniquely Decodable Codes (UDC) for uncoded synchronous overloaded Code Division Multiple Access (CDMA), in which the number of codes exceeds the assigned code length. For the construction, the proposed recursive method at iteration k generates a matrix that can be classified into k orthogonal subsets of different dimensions. Out of them, all b...
Journal
Journal title: Theoretical Computer Science
Year: 2004
ISSN: 0304-3975
DOI: 10.1016/j.tcs.2004.10.010